Propagating Both Trust and Distrust with Target Differentiation for Combating Web Spam

Authors

  • Xianchao Zhang
  • You Wang
  • Nan Mou
  • Wenxin Liang
Abstract

Propagating trust/distrust from a set of seed (good/bad) pages to the entire Web has been widely used to combat Web spam. It has been suggested that combining good and bad seeds can lead to better results, but little work is known to have realized this insight successfully. A serious issue with existing algorithms is that trust/distrust is propagated in non-differential ways, and differential propagation seems impossible to implement if only trust or only distrust is propagated. In this paper, we take the view that each Web page has both a trustworthy side and an untrustworthy side, and assign two scores to each Web page: T-Rank, scoring its trustworthiness, and D-Rank, scoring its untrustworthiness. We then propose an integrated framework that propagates both trust and distrust. In this framework, the propagation of T-Rank/D-Rank is penalized by the target’s current D-Rank/T-Rank. In this way, both trust and distrust are propagated with target differentiation. The proposed Trust-Distrust Rank (TDR) algorithm not only makes full use of both good seeds and bad seeds, but also overcomes the disadvantages of existing trust propagation and distrust propagation algorithms. Experimental results show that TDR outperforms other typical anti-spam algorithms under various criteria.
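The abstract does not give the update equations, so the following is only a minimal sketch of the idea of penalizing propagation by the target's opposite score, not the authors' TDR formulation. It assumes that trust flows forward along links (split by the source's out-degree), that distrust flows backward (split by the linked page's in-degree), and that the penalty is the target's current trust share t / (t + d). The function name `tdr_sketch`, the damping parameter `alpha`, the neutral share of 0.5 for pages with no score yet, and the toy graph are all hypothetical choices made for illustration.

```python
def tdr_sketch(out_links, good_seeds, bad_seeds, alpha=0.85, iterations=20):
    """Illustrative joint trust/distrust propagation (assumed formulas)."""
    # Collect every page that appears as a source or a target.
    nodes = sorted(set(out_links) | {v for ts in out_links.values() for v in ts})
    in_links = {n: [] for n in nodes}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].append(u)

    # Seed vectors: uniform mass on the good seeds and on the bad seeds.
    t_seed = {n: (1.0 / len(good_seeds) if n in good_seeds else 0.0) for n in nodes}
    d_seed = {n: (1.0 / len(bad_seeds) if n in bad_seeds else 0.0) for n in nodes}
    t, d = dict(t_seed), dict(d_seed)

    for _ in range(iterations):
        new_t, new_d = {}, {}
        for p in nodes:
            total = t[p] + d[p]
            # Assumption: the target's trust share acts as the penalty factor;
            # pages with no score yet are treated as neutral (0.5).
            trust_share = t[p] / total if total > 0 else 0.5
            # Trust arrives from pages q that link to p.
            trust_in = sum(t[q] / len(out_links[q]) for q in in_links[p])
            # Distrust arrives from pages q that p links to.
            distrust_in = sum(d[q] / len(in_links[q]) for q in out_links.get(p, []))
            new_t[p] = alpha * trust_share * trust_in + (1 - alpha) * t_seed[p]
            new_d[p] = alpha * (1 - trust_share) * distrust_in + (1 - alpha) * d_seed[p]
        t, d = new_t, new_d
    return t, d


# Toy graph (hypothetical): a good seed links to "a"; "spam" links to a bad seed.
graph = {"good": ["a"], "a": ["b"], "spam": ["bad"], "b": []}
t_rank, d_rank = tdr_sketch(graph, good_seeds={"good"}, bad_seeds={"bad"})
```

In this toy graph, "a" receives trust from the good seed and no distrust, while "spam" receives distrust from the bad seed it links to and no trust. The scores are left unnormalized for brevity.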


Related resources

Propagating Trust and Distrust to Demote Web Spam

Web spamming describes behavior that attempts to deceive search engines' ranking algorithms. TrustRank is a recent algorithm that can combat web spam by propagating trust among web pages. However, TrustRank propagates trust among web pages based on the number of outgoing links, which is also how PageRank propagates authority scores among Web pages. This type of propagation may be suited for pro...


A Novel Approach to Propagating Distrust

Trust propagation is a fundamental topic of study in the theory and practice of ranking and recommendation systems on networks. The PageRank [9] algorithm ranks web pages by propagating trust throughout a network, and similar algorithms have been designed for recommendation systems. How might one analogously propagate distrust as well? This is a question of practical importance and...


Link-Based Similarity Search to Fight Web Spam

We investigate the usability of similarity search in fighting Web spam based on the assumption that an unknown spam page is more similar to certain known spam pages than to honest pages. In order to be successful, search engine spam never appears in isolation: we observe link farms and alliances for the sole purpose of search engine ranking manipulation. The artificial nature and strong inside ...


A Survey on Web Spam Detection Methods: Taxonomy

Web spam refers to some techniques, which try to manipulate search engine ranking algorithms in order to raise web page position in search engine results. In the best case, spammers encourage viewers to visit their sites, and provide undeserved advertisement gains to the page owner. In the worst case, they use malicious contents in their pages and try to install malware on the victim’s machine....


Web Spam, Propaganda and Trust

Web spamming, the practice of introducing artificial text and links into web pages to affect the results of searches, has been recognized as a major problem for search engines. It is also a serious problem for users because they are not aware of it and they tend to confuse trusting the search engine with trusting the results of a search [16]. The parallels between web spamming on the internet a...


Publication date: 2011